Kalman Temporal Differences
Similar resources
Kalman Temporal Differences
Because reinforcement learning suffers from a lack of scalability, online value (and Q-) function approximation has received increasing interest this last decade. This contribution introduces a novel approximation scheme, namely the Kalman Temporal Differences (KTD) framework, that exhibits the following features: sample-efficiency, non-linear approximation, non-stationarity handling and uncert...
Kalman Temporal Differences: Uncertainty and Value Function Approximation
This paper deals with value (and Q-) function approximation in deterministic Markovian decision processes (MDPs). A general statistical framework based on the Kalman filtering paradigm is introduced. Its principle is to adopt a parametric representation of the value function, to model the associated parameter vector as a random variable and to minimize the mean-squared error of the parameters c...
Sample Efficient On-Line Learning of Optimal Dialogue Policies with Kalman Temporal Differences
Designing dialog policies for voice-enabled interfaces is a tailoring job that is most often left to natural language processing experts. This job is generally redone for every new dialog task because cross-domain transfer is not possible. For this reason, machine learning methods for dialog policy optimization have been investigated during the last 15 years. Especially, reinforcement learning ...
Kalman-filter based spatio-temporal disparity integration
Vision-based applications usually have as input a continuous stream of data. Therefore, it is possible to use the information generated in previous frames to improve the analysis of the current one. In the context of video-based driver-assistance systems, objects present in a scene typically perform a smooth motion through the image sequence. By considering a motion model for the ego-vehicle, i...
Speech Enhancement in Temporal Kalman Filt
In this paper a time-frequency estimator for enhancement of noisy speech signals in the DFT domain is introduced. This estimator is based on modelling and filtering the temporal trajectories of the DFT components of noisy speech signal using Kalman filters. The time-varying trajectory of the DFT components of speech is modelled by a low order autoregressive process incorporated in the state equ...
Journal
Journal title: Journal of Artificial Intelligence Research
Year: 2010
ISSN: 1076-9757
DOI: 10.1613/jair.3077